Biasing in Gaussian random fields and galaxy correlations 
ABSTRACT



In this letter we show that in a Gaussian random field the correlation length,
the typical size of correlated structures, does not change with biasing. We
interpret the amplification of the correlation functions of subsets identified
by different thresholds as being due to the increasing sparseness of peaks over
threshold. This clarifies a long-standing misconception in the literature.
We also argue that this effect does not explain the observed increase of the
amplitude of the correlation function $\xi(r)$ when galaxies of brighter luminosity
or galaxy clusters of increasing richness are considered.



Subject headings: galaxies: general; galaxies: statistics; cosmology: large-scale 
structure of the universe 

We first explain, in mathematical terms, the notion of biasing for a Gaussian random
field. Here we follow the ideas of Kaiser (1984), which have been developed further in Bardeen
et al. (1986). We then calculate biasing for some examples and we clarify the physical
meaning of bias in the context of Kaiser (1984). Finally, we comment on the significance of
our findings for the correlations of galaxies and clusters.

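We consider a homogeneous, isotropic and correlated continuous Gaussian random
field, $\delta(\mathbf{x})$, with mean zero and variance $\sigma^2 = \langle \delta(\mathbf{x})^2 \rangle$
in a volume $V$. The application of the following discussion to a discrete set of points is
straightforward considering the effect of a smoothing length. The marginal one-point
probability density function of $\delta$ is

$$ P(\delta) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{\delta^2}{2\sigma^2} \right) . $$

Using $P$, we calculate the fraction of the volume $V$ with $\delta(\mathbf{x}) \ge \nu\sigma$,
$P_1(\nu) = \int_{\nu\sigma}^{\infty} P(\delta)\, d\delta$.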
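As a quick numerical illustration (not part of the original derivation), the threshold
fraction can be evaluated in closed form as $P_1(\nu) = \frac{1}{2}\,\mathrm{erfc}(\nu/\sqrt{2})$,
which does not depend on $\sigma$ because the threshold is expressed in units of $\sigma$.
The following minimal Python sketch assumes only the zero-mean Gaussian density; the
function name `p1` is ours.

```python
import math


def p1(nu):
    """Fraction of the volume with delta >= nu*sigma for a zero-mean Gaussian field.

    Uses the closed form P_1(nu) = erfc(nu / sqrt(2)) / 2.
    """
    return 0.5 * math.erfc(nu / math.sqrt(2.0))


if __name__ == "__main__":
    for nu in (0.0, 1.0, 2.0, 3.0):
        print(nu, p1(nu))
```

For instance, `p1(1.0)` is about 0.159 and `p1(3.0)` about 0.0013, so the set above
threshold becomes sparse very quickly as $\nu$ grows.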
The correlation function between two values of $\delta(\mathbf{x})$ in two points separated by a
distance $r$ is given by $\xi(r) = \langle \delta(\mathbf{x})\, \delta(\mathbf{x} + r\mathbf{n}) \rangle$.
By definition, $\xi(0) = \sigma^2$. In this context, homogeneity means that the variance,
$\sigma^2$, and the correlation function, $\xi(r)$, do not depend on $\mathbf{x}$. Isotropy means
that $\xi(r)$ does not depend on the direction $\mathbf{n}$.^1 An important application we have in
mind is cosmological density fluctuations, $\delta(\mathbf{x}) = (\rho(\mathbf{x}) - \rho_0)/\rho_0$,
where $\rho_0 = \langle \rho \rangle$ is the mean density; but the following arguments are
completely general.^2 Here and in what follows we assume that the average density $\rho_0$ is
a well-defined positive quantity. This is not so if the distribution is fractal (Pietronero, 1987).

^1 In other words, we assume $\delta(\mathbf{x})$ to be a so-called 'stationary normal stochastic
process' (Feller 1965).

^2 Clearly, cosmological density fluctuations can never be perfectly Gaussian since
$\rho(\mathbf{x}) \ge 0$ and thus $\delta(\mathbf{x}) \ge -1$, but, for small fluctuations, a Gaussian
can be a good approximation. Furthermore, our results remain at least qualitatively correct
also in the non-Gaussian case.

Our goal is to determine the correlation function of local maxima from the correlation
function of the underlying density field. Like Kaiser (1984) we simplify the problem by
computing the correlations of regions above a certain threshold $\nu\sigma$ instead of the
correlations of maxima. However, these quantities are closely related for values of $\nu$
significantly larger than 1. We define the threshold density, $\theta_\nu(\mathbf{x})$, by

$$ \theta_\nu(\mathbf{x}) \equiv \theta(\delta(\mathbf{x}) - \nu\sigma) =
   \begin{cases} 1 & \text{if } \delta(\mathbf{x}) \ge \nu\sigma \\ 0 & \text{otherwise.} \end{cases} \tag{1} $$

Note the qualitative difference between $\delta$, which is a weighted density field, and
$\theta_\nu$, which just defines a set, all points having equal weight. We note the following
simple facts concerning the threshold density, $\theta_\nu$, due only to its definition,
independently of the correlation properties of $\delta(\mathbf{x})$:

$$ \langle \theta_\nu \rangle \equiv P_1(\nu) \le 1 , \qquad (\theta_\nu(\mathbf{x}))^n = \theta_\nu(\mathbf{x}) , \tag{2} $$

$$ \xi_\nu(0) \equiv \frac{\langle \theta_\nu^2 \rangle}{\langle \theta_\nu \rangle^2} - 1 = \frac{1}{P_1(\nu)} - 1 , \qquad
   \theta_{\nu'}(\mathbf{x}) \le \theta_\nu(\mathbf{x}) , \quad P_1(\nu') < P_1(\nu) \ \text{for } \nu' > \nu , \qquad
   \xi_{\nu'}(0) > \xi_\nu(0) \ \text{for } \nu' > \nu . \tag{3} $$



The difference between $\theta_\nu$ for different values of $\nu$ is called biasing. The
enhancement of $\xi_\nu(0)$ for higher thresholds clearly has nothing to do with how 'strongly
clustered' the peaks are, but is entirely due to the fact that the larger $\nu$, the lower the
fraction of points above the threshold (i.e. $P_1(\nu') < P_1(\nu)$ for $\nu' > \nu$). If we
consider the trivial case of white Gaussian noise ($\xi(r) = 0$ for $r > 0$) the peaks are just
spikes. When a threshold $\nu\sigma$ is considered the number of spikes decreases, and hence
$\xi_\nu(0)$ is amplified because they are much more sparse and not because they are 'more
strongly clustered'. We show in the following that also in the case of a correlated field
($\xi(r) \ne 0$ for $r > 0$) sparseness is crucial in order to explain the amplification of
$\xi_\nu(r)$.
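The white-noise case can be checked with a small Monte Carlo experiment. The sketch below
is an illustration under stated assumptions, not the authors' computation: it draws
uncorrelated unit-variance Gaussian samples on a one-dimensional grid (so $\sigma = 1$),
takes the zero-lag value as $\langle \theta_\nu^2 \rangle / \langle \theta_\nu \rangle^2 - 1$,
and estimates the correlation of $\theta_\nu$ at a lag of one grid point in the same way.
The function name, sample size and seed are hypothetical choices.

```python
import random


def threshold_stats(nu, n=200_000, seed=0):
    """Monte Carlo estimate of P_1, xi_nu(0) and xi_nu(lag=1) for white Gaussian noise."""
    rng = random.Random(seed)
    # Indicator field theta_nu on a 1-d grid of independent Gaussian values.
    theta = [1 if rng.gauss(0.0, 1.0) >= nu else 0 for _ in range(n)]
    p_hat = sum(theta) / n
    # theta**2 == theta, hence <theta^2>/<theta>^2 - 1 = 1/p_hat - 1.
    xi_zero = 1.0 / p_hat - 1.0
    # Fraction of neighbouring pairs that are both above threshold.
    pair = sum(a * b for a, b in zip(theta, theta[1:])) / (n - 1)
    xi_one = pair / p_hat**2 - 1.0
    return p_hat, xi_zero, xi_one


if __name__ == "__main__":
    for nu in (1.0, 2.0):
        print(nu, threshold_stats(nu))
```

In such a run the zero-lag value grows strongly with $\nu$ while the lag-one value stays
compatible with zero: the amplification at zero separation reflects sparseness only.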

In the context of cosmological density fluctuations, if the average density of matter
is a well-defined positive constant, the amplitude of the correlation function $\xi_m(r)$ of
the matter distribution is very important, since its integral over a given radius is
proportional to the overdensity on this scale.



The scale $R_l$ where $\sigma(R_l) \sim 1$ separates large, non-linear fluctuations from small
ones (Gaite et al., 1999). It is very important to stress the following point: from the
knowledge of the functions $\xi_\nu(r)$ for two different subsets of the density field obtained
from two different values $\nu$ and $\nu'$ of the threshold, it is not possible to predict the
amplitude of the fluctuations of the original density field at any scale if we do not know the
underlying values $\nu$, $\nu'$ and $\sigma$. On the other hand, as we are going to show, the only
feature of the original field which can be inferred from the behavior of $\xi_\nu(r)$ is the
large scale behavior of the correlation function $\xi(r)$, in particular the correlation length
(if this length is finite, in the statistical physics terminology). The correlation length
$r_c$ can be defined as (Gaite et al. 1999):

$$ r_c^2 = -\frac{1}{2} \left. \frac{\nabla^2 P(k)}{P(k)} \right|_{k=0} , \tag{5} $$

where $P(k)$ is the Fourier transform of $\xi(r)$. Note that $r_c$ is independent of any
multiplying constant in $\xi(r)$, so it is not related to its amplitude. This correlation length
is that used in statistical physics and field theory (Ma 1984), and gives the length scale
beyond which $\xi(r)$ decays rapidly to zero (e.g. exponentially). Roughly, this implies that the
fluctuations of the field are organized in structures up to a scale $r_c$ (Gaite et al. 1999).
However, in cosmology the correlation length has been defined historically (Peebles 1980)
through the amplitude of $\xi(r)$ by looking at the distance $r_0$ at which it is equal to 1.
Provided that a constant positive density $\rho_0$ of the field exists, $r_0$ gives the scale
beyond which the fluctuations become small with respect to $\rho_0$ (it is then analogous to the
previously defined $R_l$), and hence it also provides the minimal size of a sample of the field
giving a good estimate of the intrinsic $\rho_0$. The confusion between $r_c$ and $r_0$ (see also
Gaite et al. 1999) is at the basis of the misinterpretation of the concept of bias, as we are
going to show.

The joint two-point probability density $\mathcal{P}_2(\delta, \delta'; r)$ depends on the distance
$r$ between $\mathbf{x}$ and $\mathbf{x}'$, where $\delta = \delta(\mathbf{x})$ and
$\delta' = \delta(\mathbf{x}')$. For Gaussian fields $\mathcal{P}_2$ is entirely determined by
the two-point correlation function $\xi(r)$ (Rice 1954, Feller 1965):

$$ \mathcal{P}_2(\delta, \delta'; r) = \frac{1}{2\pi\sqrt{\sigma^4 - \xi^2(r)}}
   \exp\left( -\frac{\sigma^2 (\delta^2 + \delta'^2) - 2\xi(r)\, \delta\delta'}{2(\sigma^4 - \xi^2(r))} \right) . \tag{6} $$




By definition 
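The probability that both $\delta$ and $\delta'$ are larger than $\nu\sigma$ is

$$ P_2(\nu, r) = \int_{\nu\sigma}^{\infty}\!\int_{\nu\sigma}^{\infty} \mathcal{P}_2(\delta, \delta'; r)\, d\delta\, d\delta'
   = \langle \theta_\nu(\mathbf{x})\, \theta_\nu(\mathbf{x} + r\mathbf{n}) \rangle . \tag{8} $$

The conditional probability that $\delta(\mathbf{y}) \ge \nu\sigma$, given $\delta(\mathbf{x}) \ge \nu\sigma$,
where $|\mathbf{x} - \mathbf{y}| = r$, is then just $P_2(\nu, r)/P_1(\nu)$. The two-point correlation
function for the stochastic variable $\theta_\nu(\mathbf{x})$ introduced above can be expressed in
terms of $P_1$ and $P_2$ by

$$ \xi_\nu(r) = \frac{P_2(\nu, r)}{P_1(\nu)^2} - 1 . \tag{9} $$

Defining $\xi_c(r) = \xi(r)/\sigma^2$, we obtain

$$ P_1(\nu)^2 \left( \xi_\nu(r) + 1 \right) = \frac{1}{2\pi\sqrt{1 - \xi_c^2(r)}}
   \int_{\nu}^{\infty}\!\int_{\nu}^{\infty} \exp\left( -\frac{x^2 + x'^2 - 2\xi_c(r)\, x x'}{2(1 - \xi_c^2(r))} \right) dx\, dx' . \tag{10} $$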
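Eq. (10) can be evaluated numerically. The sketch below is one possible approach, chosen for
convenience rather than taken from the source: it integrates out one variable analytically,
using the fact that for two unit-variance Gaussian variables with correlation $\xi_c$ the
conditional distribution of $x'$ given $x$ is Gaussian with mean $\xi_c x$ and variance
$1 - \xi_c^2$, and handles the remaining one-dimensional integral with the trapezoidal rule.
The function names, the grid size and the integration cutoff are our assumptions, and the
code requires $|\xi_c| < 1$.

```python
import math


def p1(nu):
    """One-point fraction above threshold, P_1(nu) = erfc(nu / sqrt(2)) / 2."""
    return 0.5 * math.erfc(nu / math.sqrt(2.0))


def p2(nu, xi_c, n=4000, span=10.0):
    """Joint probability that both normalized values exceed nu (requires |xi_c| < 1).

    P_2 = int_nu^inf phi(x) * Q((nu - xi_c * x) / sqrt(1 - xi_c^2)) dx,
    where phi is the unit Gaussian density and Q its upper tail probability.
    The integral is truncated at nu + span and done with the trapezoidal rule.
    """
    s = math.sqrt(1.0 - xi_c * xi_c)
    h = span / n
    total = 0.0
    for i in range(n + 1):
        x = nu + i * h
        phi = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        tail = 0.5 * math.erfc((nu - xi_c * x) / (s * math.sqrt(2.0)))
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * phi * tail
    return total * h


def xi_nu(nu, xi_c):
    """Correlation function of the thresholded set, xi_nu = P_2 / P_1^2 - 1."""
    return p2(nu, xi_c) / p1(nu) ** 2 - 1.0


if __name__ == "__main__":
    for nu in (1.0, 2.0, 3.0):
        print(nu, xi_nu(nu, 0.05))
```

With these definitions `xi_nu(nu, 0.0)` vanishes up to quadrature error, and at fixed
$\xi_c > 0$ the value grows with $\nu$.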

It is worth noting that the amplitude of $\xi_\nu(r)$ does not give information about how large
the fluctuations are with respect to $\rho_0$, but rather describes the "fluctuations of the
fluctuations", that is the fluctuations of the new variable $\theta_\nu(\mathbf{x})$ around its
average $P_1(\nu)$. Arguments similar to those introduced for the original field can now be
developed to characterize the typical scales of the new set defined by $\theta_\nu(\mathbf{x})$.
In particular, one can define a correlation length $r_c(\nu)$ using the analog of Eq. (5), by
replacing $\xi(r)$ with $\xi_\nu(r)$. Like $r_c$, $r_c(\nu)$ does not depend on any multiplicative
constant in $\xi_\nu(r)$, i.e. it does not depend on the amplitude of $\xi_\nu(r)$. Moreover a
'homogeneity scale' $r_0(\nu)$ can be defined by looking at the scale at which $\xi_\nu(r) = 1$
(or alternatively through Eq. (4)). The value of $r_0(\nu)$ depends strongly on the amplitude of
$\xi_\nu(r)$ and represents the minimal size of a sample of the set giving meaningful estimates
of the average density $P_1(\nu)$ and of $r_0(\nu)$ itself; $r_0(\nu)$ is the distance at which the
conditional density $P_2(\nu, r)/P_1(\nu)$ begins to flatten towards $P_1(\nu)$. We show below that
while $r_0(\nu)$ depends strongly on $\nu$ due to a sparseness effect, $r_c(\nu)$ is almost constant
and equal to $r_c$ of the field, i.e. the maximal size of the fluctuations' structures does not
depend on the threshold.
Eq. (10) implies, for $\nu \gg 1$ and for sufficiently large $r$ such that $\xi_c(r) \ll 1$
(Politzer & Wise, 1984):

$$ \xi_\nu(r) \simeq \exp\left( \nu^2 \xi_c(r) \right) - 1 , \tag{11} $$

to lowest non-vanishing order in $\xi_c(r)$. If, in addition, $\nu^2 \xi_c(r) \ll 1$ we find
(Politzer & Wise, 1984)

$$ \xi_\nu(r) \simeq \nu^2 \xi_c(r) . \tag{12} $$

This is the relation derived by Kaiser (1984). He only states the condition $\xi_c(r) \ll 1$ and
separately $\nu \gg 1$, which is significantly weaker than the required
$\nu^2 \xi_c(r) \simeq \xi_\nu(r) \ll 1$, especially around the correlation length where $\xi$ is
not yet very small.
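To see how quickly Eq. (12) departs from Eq. (11), one can compare the two expressions
directly. The following Python sketch simply evaluates both formulas; the function names are
ours and the sample values of $\nu$ and $\xi_c$ are arbitrary.

```python
import math


def xi_nu_exponential(nu, xi_c):
    """Eq. (11): xi_nu ~ exp(nu^2 * xi_c) - 1 (expm1 keeps precision for small arguments)."""
    return math.expm1(nu**2 * xi_c)


def xi_nu_linear(nu, xi_c):
    """Eq. (12): xi_nu ~ nu^2 * xi_c, valid only when nu^2 * xi_c << 1."""
    return nu**2 * xi_c


def enhancement(nu, xi_c):
    """Ratio of the exponential form to the linear (Kaiser) form."""
    return xi_nu_exponential(nu, xi_c) / xi_nu_linear(nu, xi_c)


if __name__ == "__main__":
    for xi_c in (0.001, 0.01, 0.1, 0.3):
        print(xi_c, enhancement(3.0, xi_c))
```

For $\nu = 3$ the two forms agree to better than 1% at $\xi_c = 0.001$ ($\nu^2 \xi_c = 0.009$),
whereas at $\xi_c = 0.3$ ($\nu^2 \xi_c = 2.7$) the exponential form exceeds the linear one by a
factor of about 5.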

It is important to note that in the cosmologically relevant regime, $\xi_\nu \gtrsim 1$, the
Kaiser relation (Eq. (12)) does not apply and $\xi_\nu$ is actually exponentially enhanced. If
this mechanism were the cause of the observed cluster correlation function, one would thus
expect an exponential enhancement on scales where $\xi_{cc} \gtrsim 1$, i.e.
$R \lesssim 20\, h^{-1}$ Mpc. This is in contradiction with observations
(Bahcall & Soneira 1983)!^3

^3 One might argue that non-linearities, which are important when the fluctuations are large,
can "rescue" the Kaiser relation (12) also into the regime $\xi_\nu > 1$. There are two
objections against this. First of all, as we pointed out above, $\xi_\nu > 1$ does not imply
large fluctuations of the original density field. Actually most cosmologists would agree that
on $R \sim 20\, h^{-1}$ Mpc, where the cluster correlation function $\xi_{cc} \sim 1$, fluctuations
are linear. Secondly, it seems very unphysical that Newtonian clustering should act so as to
change the exponential relation (11) into a linear one (12).

If, within a range of scales, $\xi(r)$ can be approximated by a power law,
$\xi = (r/r_0)^{-\gamma}$, and if the threshold $\nu$ is such that Eq. (12) holds, which implies
$\xi_\nu \ll 1$, we have $\xi_\nu = (r/r_0(\nu))^{-\gamma}$. The scales $r_0(\nu)$ for different
biases are related by $r_0(\nu') = r_0(\nu)\,(\nu'/\nu)^{2/\gamma}$. For that reason Kaiser, who
first derived relation (12), interpreted it as an increase in the "correlation length"
$r_0(\nu)$, which in our language is the homogeneity scale of the set $\theta_\nu(\mathbf{x})$.

In order to clarify the meaning of the two length scales $r_c(\nu)$ and $r_0(\nu)$ we first study
an example of a Gaussian density field with finite correlation length $r_c$, which is well
approximated by a power law on a certain range of scales. The case in which
$r_c \to \infty$ is straightforward. We set

$$ \xi(r) = \sigma^2\, \frac{\exp(-r/r_c)}{1 + (k_s r)^{\gamma}} $$

with $k_s^{-1} \ll r_c$. Here $k_s^{-1}$ represents the smoothing scale of the continuous field,
which is characterized better in the following, and $r_c$ is approximately the correlation
length as defined in Eq. (5). In the region $k_s^{-1} \ll r \ll r_c$, $\xi(r)$ is well
approximated by the power law $(k_s r)^{-\gamma}$. The correlation lengths $r_c(\nu)$, for any
value of $\nu$, are given by the slope of $\log \xi_\nu(r)$ at large $r$, vs. $r$, which is clearly
independent of bias (Fig. 1). This can also be obtained from Eqs. (11, 12).
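The independence of the large-$r$ slope from the threshold can be checked with a few lines of
Python. This is a sketch under explicit assumptions: $\xi_\nu$ is taken from the approximate
Eq. (11) rather than from the exact integral, and the parameter values ($r_c = 10$,
$k_s^{-1} = 0.01$, $\gamma = 2$) are our reading of the Fig. 1 caption. The function names and
the two radii used for the finite difference are hypothetical choices.

```python
import math

R_C = 10.0     # correlation length (assumed, from the Fig. 1 caption)
K_S = 100.0    # inverse smoothing scale, k_s = 1 / 0.01 (assumed)
GAMMA = 2.0    # power-law index (assumed)


def xi_c(r):
    """Normalized correlation function xi(r) / sigma^2 of the first example."""
    return math.exp(-r / R_C) / (1.0 + (K_S * r) ** GAMMA)


def log_xi_nu(nu, r):
    """log of xi_nu(r), with xi_nu taken from the approximation of Eq. (11)."""
    return math.log(math.expm1(nu**2 * xi_c(r)))


def large_r_slope(nu, r1=60.0, r2=80.0):
    """Finite-difference slope of log xi_nu versus r between r1 and r2."""
    return (log_xi_nu(nu, r2) - log_xi_nu(nu, r1)) / (r2 - r1)


if __name__ == "__main__":
    for nu in (1.0, 3.0, 5.0):
        print(nu, large_r_slope(nu))
```

In this toy evaluation the slope is the same for all thresholds to high accuracy. It equals
$-1/r_c$ plus a slowly varying contribution of order $-\gamma/r$ from the power-law factor,
which does not depend on $\nu$ either.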



EDITOR: PLACE FIGURE 1 HERE.

For relatively small values of the threshold, $\nu \lesssim \nu_c \sim (k_s r_c)^{\gamma/2}$, one
finds in this case $r_0(\nu) \lesssim r_c$ and $r_0(\nu) \sim k_s^{-1} \nu^{2/\gamma}$. On the other
hand, if $\nu \gg \nu_c$ we have $r_0(\nu) \sim r_c \log(\nu)$ and in this case the statistics is
dominated by shot noise (see below). For this reason we assume $r_0(\nu) < r_c(\nu)$ in the
following. We note that in the range of scales $r < r_0(\nu)$ the amplification of $\xi_\nu(r)$ is
strongly non-linear in $\nu$ and is scale dependent. Hence if the original correlation function
$\xi(r)$ has a power law behavior, $\xi_\nu(r)$ does not for $r < r_0(\nu)$. This is better shown in
the case in which $r_c \to \infty$. In this case the correlation function is

$$ \xi(r) = \frac{\sigma^2}{1 + (k_s r)^{\gamma}} . $$

Clearly on scales $k_s^{-1} < r < r_c$ this example does not differ from the above (but of course
the correlation length is infinite here). The amplification of $\xi_\nu(r)$ for this example is
plotted in Fig. 2. In order to investigate whether $\xi_\nu(r)$ is of the form
$\xi_\nu(r) \sim (r/r_0(\nu))^{-\gamma_\nu}$, we plot
$-d\log(\xi_\nu(r))/d\log(r) = \gamma_\nu$ in Fig. 3. Only in the regime where $\xi_\nu(r) \ll 1$
does $\gamma_\nu$ become constant and roughly independent of $\nu$. This behavior is very different
from the one found in galaxy catalogs!

EDITOR: PLACE FIGURE 2 HERE.

EDITOR: PLACE FIGURE 3 HERE.

Let us now clarify how the amplification of $\xi_\nu(r)$ is related to the increase of the
peak sparseness with the threshold $\nu$. For a Gaussian random field, the mean peak size,
$D_p(\nu)$, and the mean peak distance, $L_p(\nu)$, are respectively (Vanmarcke 1983, Coles 1986):
$D_p(\nu) \sim D_0(k_s, r_c)/\nu$ and $L_p(\nu) \sim D_0(k_s, r_c) \exp(\nu^2/6)\, \nu^{-2/3}$, so that

$$ L_p/D_p \sim \nu^{1/3} \exp(\nu^2/6) \quad \text{for } \nu \gg 1 . \tag{14} $$

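$D_0(k_s, r_c)$ is given by

$$ D_0(k_s, r_c) = \left[ \frac{\int_0^\infty dk\, P_1(k)}{\int_0^\infty dk\, k^2 P_1(k)} \right]^{1/2} , $$
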
where $P_1(k)$ is the Fourier transform of $\xi(r)$ along a line in space (in $d = 1$ it coincides
with $P(k)$). Eq. (14) shows the strong enhancement of the sparseness of peaks (objects)
with increasing $\nu$. It is this increase of sparseness which is at the origin of the
amplification by biasing. In the light of the relations above, we see that increasing $\nu$
corresponds to a very particular sampling of fluctuations: the typical size of the surviving
peaks $D_p$ varies slowly with $\nu$, while the average distance between peaks $L_p$ is more than
exponentially amplified, and finally the scale $r_c(\nu)$, over which the fluctuations are
structured, is practically unchanged.
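The growth of the sparseness with threshold is easy to tabulate. The sketch below evaluates
the ratio $L_p/D_p \sim \nu^{1/3} \exp(\nu^2/6)$, taking the power of $\nu$ implied by the
scalings of $D_p$ and $L_p$ quoted above and ignoring order-unity prefactors; the function
name is hypothetical.

```python
import math


def sparseness(nu):
    """Ratio of mean peak distance to mean peak size, L_p / D_p ~ nu^(1/3) * exp(nu^2 / 6).

    Order-unity prefactors are ignored, so only the trend with nu is meaningful.
    """
    return nu ** (1.0 / 3.0) * math.exp(nu**2 / 6.0)


if __name__ == "__main__":
    for nu in (1.0, 2.0, 3.0, 4.0, 5.0):
        print(nu, sparseness(nu))
```

For $\nu = 3$ the ratio is about 6.5 and for $\nu = 5$ roughly 110, which illustrates how
rapidly the surviving peaks become isolated.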
We have argued that bias does not influence the correlation length ($r_c(\nu) \simeq r_c$). It
amplifies the correlation function by the fact that the mean density, $P_1(\nu)$, is reduced
more strongly than the conditional density, $P_2(\nu, r)/P_1(\nu)$. According to Eq. (11), this
amplification is strongly non-linear in $\xi(r)$ (exponential) at scales where
$\nu^2 \xi_c(r) > 1$ and thus $\xi_\nu(r) > 1$.

Consequently, as we want to stress once more, the biasing mechanism introduced by
Kaiser and discussed in this work cannot lead to a relation of the form
$\xi_{\nu'}(r) = \alpha_{\nu'\nu}\, \xi_\nu(r)$ over a range of scales $r_1 < r < r_2$ such that
$1 < \xi_\nu(r_1)$ and $\xi_\nu(r_2) < 1$. But exactly this behavior is found in galaxy and cluster
catalogs. For example in Bahcall & Soneira (1983) or Benoist et al. (1986), a constant biasing
factor $\alpha_{\nu'\nu}$ over a range from about $1\, h^{-1}$ Mpc to $20\, h^{-1}$ Mpc is observed
for correlation amplitudes varying from about 20 to 0.1. We therefore conclude that the
explanation by Kaiser (1984) cannot be at the origin of the difference of the correlation
functions observed in the distribution of galaxies with different intrinsic magnitude or in
the distribution of clusters with different richness.

This result appears at first disappointing since it invalidates an explanation without 
proposing a new one. On the other hand, the search for an explanation of an observed 
phenomenon is only motivated if we are fully aware of the fact that we don't already have 
one. 

Last but not least, we want to point out that fractal density fluctuations, together with
the fact that more luminous objects are seen out to larger distances, do actually induce an
increase in the amplitude of the correlation function $\xi(r)$ similar to the one observed in
real galaxy catalogs (Pietronero 1987, Sylos Labini, Montuori & Pietronero 1998). In this
explanation, the linear amplification found for the correlation function has nothing to do
with a correlation length but is a pure finite size effect, and the distribution of galaxies does
not have any intrinsic characteristic scale.
Fig. 1. Behavior of $\xi(r) = \sigma^2 \exp(-r/r_c) / [1 + (k_s r)^{\gamma}]$ (where $\gamma = 2$,
$k_s^{-1} = 0.01$ and $r_c = 10$) and $\xi_\nu(r)$ are shown for different values of the threshold
$\nu$ in a semi-log plot. The slope of $\xi_\nu(r)$ for $r \gtrsim 50$ is $-1/r_c$, independent of
$\nu$, i.e. the correlation length of the system does not change for the sets above the threshold.

Fig. 2. Behavior of $\xi(r) \sim \sigma^2 / (1 + (k_s r)^{\gamma})$ (with $\gamma = 2$,
$k_s^{-1} = 0.01$) and $\xi_\nu(r)$ are shown for different values of the threshold $\nu$ in a
log-log plot.

Fig. 3. The behavior of $\gamma_\nu(r)$ is shown for different values of the threshold $\nu$ for the
correlation function shown in Fig. 2. Clearly $\gamma_\nu$ is strongly scale dependent on all
scales where $\xi_\nu > 1$, that is $r < 1$ in our units.